Smart Paper Notes

BERT, can HE predict contrastive focus? Predicting and controlling prominence in neural TTS using a language model

Brooke Stephenson , Laurent Besacier , Laurent Girin , Thomas Hueber

Category: Natural Language Processing

2022-07-04

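Several recent studies have tested the use of transformer language model representations to infer prosodic features for text-to-speech synthesis (TTS). While these studies have explored prosody in general, in this work we look specifically at the prediction of contrastive focus on personal pronouns. This is a particularly challenging task, as it often requires semantic, discursive and/or pragmatic knowledge to predict correctly. We collect a corpus of utterances containing contrastive focus, and we evaluate the accuracy of a BERT model, fine-tuned to predict quantized acoustic prominence features, on these samples. We also investigate how past utterances can provide relevant information for this prediction. Furthermore, we evaluate the controllability of pronoun prominence in a TTS model conditioned on acoustic prominence features.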

Translated by Google Translate

Machine Learning Coarse-Grained Potentials of Protein Thermodynamics

Maciej Majewski , Adrià Pérez , Philipp Thölke , Stefan Doerr , Nicholas E. Charron , Toni Giorgino , Brooke E. Husic , Cecilia Clementi , Frank Noé , Gianni De Fabritiis

Category: Machine Learning

2022-12-14

A generalized understanding of protein dynamics is an unsolved scientific problem, the solution of which is critical to the interpretation of the structure-function relationships that govern essential biological processes. Here, we approach this problem by constructing coarse-grained molecular potentials based on artificial neural networks and grounded in statistical mechanics. For training, we build a unique dataset of unbiased all-atom molecular dynamics simulations of approximately 9 ms for twelve different proteins with multiple secondary structure arrangements. The coarse-grained models are capable of accelerating the dynamics by more than three orders of magnitude while preserving the thermodynamics of the systems. Coarse-grained simulations identify relevant structural states in the ensemble with comparable energetics to the all-atom systems. Furthermore, we show that a single coarse-grained potential can integrate all twelve proteins and can capture experimental structural features of mutated proteins. These results indicate that machine learning coarse-grained potentials could provide a feasible approach to simulate and understand protein dynamics.

Provably Reliable Large-Scale Sampling from Gaussian Processes

Anthony Stephenson , Robert Allison , Edward Pyzer-Knapp

Categories: Machine Learning (Statistics) | Machine Learning

2022-11-15

When comparing approximate Gaussian process (GP) models, it can be helpful to be able to generate data from any GP. If we are interested in how approximate methods perform at scale, we may wish to generate very large synthetic datasets to evaluate them. Naïvely doing so would cost \(\mathcal{O}(n^3)\) flops and \(\mathcal{O}(n^2)\) memory to generate a size \(n\) sample. We demonstrate how to scale such data generation to large \(n\) whilst still providing guarantees that, with high probability, the sample is indistinguishable from a sample from the desired GP.
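
For orientation, the naïve baseline that the abstract refers to can be sketched as follows. This is only an illustration of exact sampling via a Cholesky factorisation, not the scalable method the paper proposes; the squared-exponential kernel, the `jitter` term and all function names are assumptions made for this sketch.

```python
import numpy as np


def rbf_kernel(x, lengthscale=0.1, variance=1.0):
    """Squared-exponential kernel matrix for 1-D inputs (assumed kernel choice)."""
    sq_dists = (x[:, None] - x[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dists / lengthscale**2)


def naive_gp_sample(x, rng, jitter=1e-6):
    """Draw one exact sample from a zero-mean GP prior at the inputs x."""
    n = x.shape[0]
    # Building the dense kernel matrix needs O(n^2) memory.
    K = rbf_kernel(x) + jitter * np.eye(n)
    # The Cholesky factorisation costs O(n^3) flops.
    L = np.linalg.cholesky(K)
    # L @ z has covariance L L^T = K when z is standard normal.
    return L @ rng.standard_normal(n)


rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 200)
y = naive_gp_sample(x, rng)
```

Both the dense matrix and the factorisation become impractical as `n` grows, which is the cost the abstract sets out to avoid.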

Fast Benchmarking of Accuracy vs. Training Time with Cyclic Learning Rates

Jacob Portes , Davis Blalock , Cory Stephenson , Jonathan Frankle

Category: Machine Learning

2022-06-02

Benchmarking the tradeoff between neural network accuracy and training time is computationally expensive. Here we show how a multiplicative cyclic learning rate schedule can be used to construct a tradeoff curve in a single training run. We generate cyclic tradeoff curves for combinations of training methods such as Blurpool, Channels Last, Label Smoothing and MixUp, and highlight how these cyclic tradeoff curves can be used to evaluate the effects of algorithmic choices on network training efficiency.
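
As a rough illustration of the idea, the sketch below shows one possible reading of a "multiplicative" cyclic schedule: each cycle is a constant factor longer than the previous one, and the learning rate decays linearly to zero within a cycle. The decay shape, the parameter names and the default values are assumptions for this sketch and are not taken from the paper. Evaluating the network at the end of each cycle would then give one (training time, accuracy) point per cycle from a single run.

```python
def multiplicative_cyclic_lr(step, base_lr=0.1, first_cycle_steps=100, multiplier=2.0):
    """Learning rate at a given step under the assumed cyclic schedule."""
    cycle_len = first_cycle_steps
    cycle_start = 0
    # Walk forward through the cycles until we reach the one containing `step`.
    while step >= cycle_start + cycle_len:
        cycle_start += cycle_len
        cycle_len = int(cycle_len * multiplier)
    progress = (step - cycle_start) / cycle_len
    # Linear decay from base_lr towards zero within the cycle (assumed shape).
    return base_lr * (1.0 - progress)


def cycle_end_steps(num_cycles, first_cycle_steps=100, multiplier=2.0):
    """Cumulative step counts at which each cycle ends (candidate evaluation points)."""
    ends, total, cycle_len = [], 0, first_cycle_steps
    for _ in range(num_cycles):
        total += cycle_len
        ends.append(total)
        cycle_len = int(cycle_len * multiplier)
    return ends


evaluation_points = cycle_end_steps(3)
```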

General Board Geometry

Cameron Browne , Éric Piette , Matthew Stephenson , Dennis J. N. J. Soemers

Category: Artificial Intelligence

2021-11-22

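Game boards are described in the Ludii general game system by their underlying graphs, based on tiling, shape and graph operators, with automatic detection of important properties such as topological relationships between graph elements, directions and radial step sequences. This approach allows most conceivable game boards to be described simply and succinctly.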

Disaster mapping from satellites: damage detection with crowdsourced point labels

Danil Kuzin , Olga Isupova , Brooke D. Simmons , Steven Reece

Categories: Computer Vision | Machine Learning

2021-11-05

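High-resolution satellite imagery available immediately after disaster events is crucial for response planning, as it facilitates broad situational awareness of critical infrastructure status such as building damage, flooding, and obstructions to access routes. Damage mapping at this scale would require hundreds of expert person-hours. However, a combination of crowdsourcing and recent advances in deep learning reduces the effort needed to just a few hours in real time. Asking volunteers to place point marks, as opposed to shapes of actual damaged areas, significantly decreases the analysis time required for a response during the disaster. However, different volunteers may be inconsistent in their marks. This work presents methods for aggregating potentially inconsistent damage marks to train a neural network damage detector.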

Optimised Playout Implementations for the Ludii General Game System

Dennis J. N. J. Soemers , Éric Piette , Matthew Stephenson , Cameron Browne

Category: Artificial Intelligence

2021-11-04

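This paper describes three different optimised implementations of playouts, as commonly used by game-playing algorithms such as Monte-Carlo Tree Search. Each of the optimised implementations is applicable only to specific sets of games, based on their rules. The Ludii general game system can automatically infer, based on the description of a game in its general game description language, whether any optimised implementation is applicable. An empirical evaluation demonstrates major speedups over a standard implementation, with a median result of running playouts 5.08 times as fast, over 145 different games in Ludii for which one of the optimised implementations is applicable.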

Deeptime: a Python library for machine learning dynamical models from time series data

Moritz Hoffmann , Martin Scherer , Tim Hempel , Andreas Mardt , Brian de Silva , Brooke E. Husic , Stefan Klus , Hao Wu , Nathan Kutz , Steven L. Brunton

Categories: Machine Learning | Machine Learning (Statistics)

2021-10-28

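Generation and analysis of time-series data is relevant to many quantitative fields, ranging from economics to fluid mechanics. In the physical sciences, structures such as metastable and coherent sets, slow relaxation processes, collective variables, dominant transition pathways, or manifolds and channels of probability flow can be of great importance for understanding and characterizing the kinetic, thermodynamic and mechanistic properties of a system. Deeptime is a general-purpose Python library offering various tools to estimate dynamical models from time-series data, including conventional linear learning methods, such as Markov state models (MSMs), hidden Markov models and Koopman models, as well as kernel and deep learning approaches such as VAMPnets and deep MSMs. The library is largely compatible with scikit-learn, providing a range of Estimator classes for these different models. In contrast to scikit-learn, it also provides deep Model classes, e.g. in the case of an MSM, which offer a multitude of analysis methods to compute interesting thermodynamic, kinetic and dynamical quantities, such as free energies, relaxation times and transition paths. The library is designed for ease of use, but also for easily maintainable and extensible code. In this paper we introduce the main features and structure of the deeptime software.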
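
To make the Markov state model idea concrete, the following minimal sketch estimates a transition matrix from a discrete trajectory by counting transitions at a fixed lag and normalising each row. It uses plain NumPy and is not the Deeptime API; the function name, the toy trajectory and the simple maximum-likelihood estimator are assumptions chosen for illustration.

```python
import numpy as np


def estimate_transition_matrix(dtraj, n_states, lag=1):
    """Row-normalised transition counts at the given lag (simple MLE, no reversibility constraint)."""
    counts = np.zeros((n_states, n_states))
    # Count every observed transition i -> j separated by `lag` steps.
    for i, j in zip(dtraj[:-lag], dtraj[lag:]):
        counts[i, j] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows for states that were never visited are left as zeros.
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)


# Toy two-state trajectory (hypothetical data).
dtraj = np.array([0, 0, 1, 1, 0, 1, 1, 1, 0, 0])
T = estimate_transition_matrix(dtraj, n_states=2)
```

Quantities such as relaxation times are then derived from the eigenvalues of the estimated matrix; a dedicated library adds estimators with statistical safeguards that this sketch omits.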

Practical Galaxy Morphology Tools from Deep Supervised Representation Learning

Mike Walmsley , Anna M. M. Scaife , Chris Lintott , Michelle Lochner , Verlon Etsebeth , Tobias Géron , Hugh Dickinson , Lucy Fortson , Sandor Kruk , Karen L. Masters

Category: Computer Vision

2021-10-25

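Astronomers have typically set out to solve supervised machine learning problems by creating their own representations from scratch. We show that deep learning models trained to answer every Galaxy Zoo DECaLS question learn meaningful semantic representations of galaxies that are useful for new tasks on which the models were never trained. We exploit these representations to outperform several recent approaches at practical tasks crucial for investigating large galaxy samples. The first task is identifying galaxies of similar morphology to a query galaxy. Given a single galaxy assigned a free-text tag by humans (e.g. "#diffuse"), we can find galaxies matching that tag for most tags. The second task is identifying the most interesting anomalies to a particular researcher. Our approach is 100% accurate at identifying the most interesting 100 anomalies (as judged by Galaxy Zoo 2 volunteers). The third task is adapting a model to solve a new task using only a small number of newly labelled galaxies. Models fine-tuned from our representation are better able to identify ring galaxies than models fine-tuned from terrestrial images (ImageNet) or trained from scratch. We solve each task with very few new labels: either one (for the similarity search) or several hundred (for anomaly detection or fine-tuning). This challenges the longstanding view that deep supervised methods require new large labelled datasets for practical use in astronomy. To help the community benefit from our pretrained models, we release our fine-tuning code, Zoobot. Zoobot is accessible to researchers with no prior experience.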

Galaxy Zoo DECaLS: Detailed Visual Morphology Measurements from Volunteers and Deep Learning for 314,000 Galaxies

Mike Walmsley , Chris Lintott , Tobias Geron , Sandor Kruk , Coleman Krawczyk , Kyle W. Willett , Steven Bamford , Lee S. Kelvin , Lucy Fortson , Yarin Gal

Category: Computer Vision

2021-02-16

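We present Galaxy Zoo DECaLS: detailed visual morphological classifications for Dark Energy Camera Legacy Survey images of galaxies within the SDSS DR8 footprint. Deeper DECaLS images (r = 23.6 vs. r = 22.2 from SDSS) reveal spiral arms, weak bars, and tidal features not previously seen in SDSS imaging. To best exploit the greater depth of DECaLS images, volunteers select from a new set of answers designed to improve sensitivity to mergers and bars. Galaxy Zoo volunteers provide 7.5 million individual classifications over 314,000 galaxies. 140,000 galaxies receive at least 30 classifications, sufficient to accurately measure detailed morphology such as bars, and the remainder receive approximately 5. All classifications are used to train an ensemble of Bayesian convolutional neural networks (a state-of-the-art deep learning method) to predict posteriors for the detailed morphology of all 314,000 galaxies. When measured against confident volunteer classifications, the networks are approximately 99% accurate on every question. Morphology is a fundamental feature of every galaxy; our human and machine classifications are an accurate and detailed resource for understanding how galaxies evolve.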